Overview

Dataset statistics

Number of variables: 23
Number of observations: 641914
Missing cells: 8758
Missing cells (%): 0.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 482.2 MiB
Average record size in memory: 787.7 B
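
The percentage and per-record figures follow from the raw counts. A minimal Python sketch of that arithmetic, assuming the missing-cell percentage is taken over all cells (rows times variables) and that 1 MiB is 2**20 bytes; the variable names are illustrative and not part of the report:

```python
# Figures copied from the dataset statistics.
n_variables = 23
n_observations = 641914
missing_cells = 8758
total_size_mib = 482.2

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_size_mib * 2**20 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions both values round to the figures the report shows.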

Variable types

CAT: 11
NUM: 9
BOOL: 3

Reproduction

Analysis started: 2020-05-29 21:04:26.547806
Analysis finished: 2020-05-29 21:07:48.922473
Duration: 3 minutes and 22.37 seconds
Version: pandas-profiling v2.8.0
Command line: pandas_profiling --config_file config.yaml [YOUR_FILE.csv]
Download configuration: config.yaml

Warnings

transactionDateTime has a high cardinality: 635472 distinct values [High cardinality]
merchantName has a high cardinality: 2493 distinct values [High cardinality]
currentExpDate has a high cardinality: 165 distinct values [High cardinality]
accountOpenDate has a high cardinality: 1826 distinct values [High cardinality]
dateOfLastAddressChange has a high cardinality: 2186 distinct values [High cardinality]
customerId is highly correlated with accountNumber [High correlation]
accountNumber is highly correlated with customerId [High correlation]
enteredCVV is highly correlated with cardCVV [High correlation]
cardCVV is highly correlated with enteredCVV [High correlation]
merchantCountryCode is highly correlated with acqCountry [High correlation]
acqCountry is highly correlated with merchantCountryCode [High correlation]
transactionDateTime is uniformly distributed [Uniform]
transactionAmount has 18479 (2.9%) zeros [Zeros]
cardLast4Digits has 6727 (1.0%) zeros [Zeros]
currentBalance has 33622 (5.2%) zeros [Zeros]
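
To illustrate what a high-correlation warning between two numeric columns means, here is a small sketch that computes a Pearson coefficient by hand and compares it to a cutoff. The 0.9 threshold, the helper name `pearson`, and the pairing of identical values are assumptions made for illustration; the report does not state which correlation measure or threshold triggered its warnings.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Values taken from the accountNumber frequency table; pairing them with
# identical customerId values is an assumption, suggested by the two
# variables sharing the same summary statistics.
account_number = [318001076, 456044564, 812328116, 838085703]
customer_id = [318001076, 456044564, 812328116, 838085703]

THRESHOLD = 0.9  # assumed cutoff, not taken from the report
r = pearson(account_number, customer_id)
is_highly_correlated = abs(r) > THRESHOLD
print(is_highly_correlated)
```

If the two columns really are identical, one of them could be dropped before modelling without losing information.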

Variables

accountNumber
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 5000
Unique (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 554770145.8938737
Minimum: 100547107
Maximum: 999985343
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 100547107
5-th percentile: 162363014
Q1: 322319158
Median: 543887911
Q3: 786227686
95-th percentile: 947027654
Maximum: 999985343
Range: 899438236
Interquartile range (IQR): 463908528

Descriptive statistics

Standard deviation: 254688449
Coefficient of variation (CV): 0.4590882385
Kurtosis: -1.25930888
Mean: 554770145.9
Median Absolute Deviation (MAD): 226867261
Skewness: 0.02824548133
Sum: 3.561147234e+14
Variance: 6.486620607e+16
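
Several of these statistics are derived from others in the same block. A short sketch, using the accountNumber figures as reported, shows the relationships (range, IQR, coefficient of variation, variance); the variable names are illustrative:

```python
# Figures copied from the accountNumber statistics.
minimum, maximum = 100547107, 999985343
q1, q3 = 322319158, 786227686
mean = 554770145.9
std = 254688449

value_range = maximum - minimum   # Range
iqr = q3 - q1                     # Interquartile range (IQR)
cv = std / mean                   # Coefficient of variation (CV)
variance = std ** 2               # Variance, up to rounding of std

print(value_range, iqr, round(cv, 4), f"{variance:.3e}")
```

Because the standard deviation is shown rounded to an integer, the recomputed variance agrees with the reported one only to within rounding.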
Histogram with fixed size bins (bins=10)
Value                  Count     Frequency (%)
318001076              10034     1.6%
456044564              8382      1.3%
812328116              5494      0.9%
838085703              5129      0.8%
239875038              4705      0.7%
877017103              4435      0.7%
278064853              4227      0.7%
353215513              3756      0.6%
314506271              3410      0.5%
917216469              3258      0.5%
822203001              3046      0.5%
412558887              3044      0.5%
901922840              3021      0.5%
428892294              2983      0.5%
235721673              2687      0.4%
772212779              2613      0.4%
990764813              2594      0.4%
226896970              2387      0.4%
520717889              2324      0.4%
289059209              2275      0.4%
792317293              2205      0.3%
832350956              2084      0.3%
377838194              1978      0.3%
484705396              1920      0.3%
719873381              1917      0.3%
Other values (4975)    552006    86.0%

Value        Count    Frequency (%)
100547107    85       < 0.1%
100634414    24       < 0.1%
100973869    46       < 0.1%
101192712    20       < 0.1%
101548993    29       < 0.1%
101660233    46       < 0.1%
101680180    41       < 0.1%
101754476    31       < 0.1%
101970909    27       < 0.1%
102085969    27       < 0.1%

Value        Count    Frequency (%)
999985343    104      < 0.1%
999984515    32       < 0.1%
999789077    72       < 0.1%
999275549    230      < 0.1%
999273501    8        < 0.1%
999246377    43       < 0.1%
999116200    295      < 0.1%
998837644    32       < 0.1%
998480579    27       < 0.1%
998034300    31       < 0.1%

customerId
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 5000
Unique (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 554770145.8938737
Minimum: 100547107
Maximum: 999985343
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 100547107
5-th percentile: 162363014
Q1: 322319158
Median: 543887911
Q3: 786227686
95-th percentile: 947027654
Maximum: 999985343
Range: 899438236
Interquartile range (IQR): 463908528

Descriptive statistics

Standard deviation: 254688449
Coefficient of variation (CV): 0.4590882385
Kurtosis: -1.25930888
Mean: 554770145.9
Median Absolute Deviation (MAD): 226867261
Skewness: 0.02824548133
Sum: 3.561147234e+14
Variance: 6.486620607e+16
Histogram with fixed size bins (bins=10)
Value                  Count     Frequency (%)
318001076              10034     1.6%
456044564              8382      1.3%
812328116              5494      0.9%
838085703              5129      0.8%
239875038              4705      0.7%
877017103              4435      0.7%
278064853              4227      0.7%
353215513              3756      0.6%
314506271              3410      0.5%
917216469              3258      0.5%
822203001              3046      0.5%
412558887              3044      0.5%
901922840              3021      0.5%
428892294              2983      0.5%
235721673              2687      0.4%
772212779              2613      0.4%
990764813              2594      0.4%
226896970              2387      0.4%
520717889              2324      0.4%
289059209              2275      0.4%
792317293              2205      0.3%
832350956              2084      0.3%
377838194              1978      0.3%
484705396              1920      0.3%
719873381              1917      0.3%
Other values (4975)    552006    86.0%

Value        Count    Frequency (%)
100547107    85       < 0.1%
100634414    24       < 0.1%
100973869    46       < 0.1%
101192712    20       < 0.1%
101548993    29       < 0.1%
101660233    46       < 0.1%
101680180    41       < 0.1%
101754476    31       < 0.1%
101970909    27       < 0.1%
102085969    27       < 0.1%

Value        Count    Frequency (%)
999985343    104      < 0.1%
999984515    32       < 0.1%
999789077    72       < 0.1%
999275549    230      < 0.1%
999273501    8        < 0.1%
999246377    43       < 0.1%
999116200    295      < 0.1%
998837644    32       < 0.1%
998480579    27       < 0.1%
998034300    31       < 0.1%

creditLimit
Real number (ℝ≥0)

Distinct count: 10
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10697.210607651492
Minimum: 250
Maximum: 50000
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 250
5-th percentile: 500
Q1: 5000
Median: 7500
Q3: 15000
95-th percentile: 50000
Maximum: 50000
Range: 49750
Interquartile range (IQR): 10000

Descriptive statistics

Standard deviation: 11460.35913
Coefficient of variation (CV): 1.07134089
Kurtosis: 5.364802458
Mean: 10697.21061
Median Absolute Deviation (MAD): 5000
Skewness: 2.294548813
Sum: 6866689250
Variance: 131339831.5
Histogram with fixed size bins (bins=10)
Value    Count     Frequency (%)
5000     127001    19.8%
7500     105340    16.4%
15000    91936     14.3%
10000    67477     10.5%
20000    64307     10.0%
2500     59421     9.3%
50000    38039     5.9%
500      32751     5.1%
1000     27861     4.3%
250      27781     4.3%

Value    Count     Frequency (%)
250      27781     4.3%
500      32751     5.1%
1000     27861     4.3%
2500     59421     9.3%
5000     127001    19.8%
7500     105340    16.4%
10000    67477     10.5%
15000    91936     14.3%
20000    64307     10.0%
50000    38039     5.9%

Value    Count     Frequency (%)
50000    38039     5.9%
20000    64307     10.0%
15000    91936     14.3%
10000    67477     10.5%
7500     105340    16.4%
5000     127001    19.8%
2500     59421     9.3%
1000     27861     4.3%
500      32751     5.1%
250      27781     4.3%
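
Because creditLimit takes only ten distinct values, the reported Sum and Mean can be recovered directly from the frequency table. A minimal sketch of that weighted-mean calculation, with the value and count pairs copied from the table (the dictionary name is illustrative):

```python
# (value: count) pairs copied from the creditLimit frequency table.
credit_limit_counts = {
    250: 27781, 500: 32751, 1000: 27861, 2500: 59421, 5000: 127001,
    7500: 105340, 10000: 67477, 15000: 91936, 20000: 64307, 50000: 38039,
}

# Number of observations is the sum of the counts.
n = sum(credit_limit_counts.values())

# Sum of the variable is each value weighted by its count.
total = sum(value * count for value, count in credit_limit_counts.items())

mean = total / n
print(n, total, round(mean, 2))
```

The counts add up to the number of observations, and the weighted sum matches the Sum statistic for this variable.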
 

availableMoney
Real number (ℝ)

Distinct count: 450690
Unique (%): 70.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6652.828572659265
Minimum: -1244.93
Maximum: 50000.0
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: -1244.93
5-th percentile: 164.68
Q1: 1114.97
Median: 3578.165
Q3: 8169.185
95-th percentile: 19992.9255
Maximum: 50000
Range: 51244.93
Interquartile range (IQR): 7054.215

Descriptive statistics

Standard deviation: 9227.132275
Coefficient of variation (CV): 1.38694875
Kurtosis: 9.474952535
Mean: 6652.828573
Median Absolute Deviation (MAD): 2912.235
Skewness: 2.888834825
Sum: 4270543800
Variance: 85139970.03
Histogram with fixed size bins (bins=10)
Value                    Count     Frequency (%)
5000                     5236      0.8%
250                      5119      0.8%
7500                     4309      0.7%
15000                    4239      0.7%
500                      3391      0.5%
10000                    2861      0.4%
2500                     2719      0.4%
20000                    2671      0.4%
1000                     1761      0.3%
50000                    1317      0.2%
214.29                   15        < 0.1%
4993.97                  15        < 0.1%
228.63                   15        < 0.1%
7459.52                  15        < 0.1%
460.59                   15        < 0.1%
4954.47                  13        < 0.1%
244.77                   13        < 0.1%
4908.94                  12        < 0.1%
19940.89                 12        < 0.1%
4863.41                  12        < 0.1%
2446.14                  12        < 0.1%
14946.48                 11        < 0.1%
7470.28                  11        < 0.1%
212.44                   11        < 0.1%
174.88                   11        < 0.1%
Other values (450665)    608098    94.7%

Value       Count    Frequency (%)
-1244.93    1        < 0.1%
-1112.12    1        < 0.1%
-1027.96    1        < 0.1%
-973.67     1        < 0.1%
-904.64     1        < 0.1%
-894.64     1        < 0.1%
-875.73     1        < 0.1%
-856.54     1        < 0.1%
-855.97     1        < 0.1%
-815.27     1        < 0.1%

Value       Count    Frequency (%)
50000       1317     0.2%
49999.94    1        < 0.1%
49999.49    1        < 0.1%
49999.43    2        < 0.1%
49999.36    1        < 0.1%
49999.26    1        < 0.1%
49999.2     1        < 0.1%
49998.72    1        < 0.1%
49998.63    1        < 0.1%
49998.55    1        < 0.1%

transactionDateTime
Categorical

HIGH CARDINALITY
UNIFORM

Distinct count: 635472
Unique (%): 99.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Value                    Count     Frequency (%)
2016-12-30T09:23:08      3         < 0.1%
2016-07-19T13:03:15      3         < 0.1%
2016-08-07T03:58:57      3         < 0.1%
2016-06-13T06:30:14      3         < 0.1%
2016-11-27T14:24:54      3         < 0.1%
2016-07-16T16:57:41      3         < 0.1%
2016-04-04T20:22:12      3         < 0.1%
2016-06-15T22:46:39      3         < 0.1%
2016-03-16T19:13:52      3         < 0.1%
2016-01-19T04:26:56      3         < 0.1%
2016-04-13T19:43:57      3         < 0.1%
2016-10-03T17:30:49      3         < 0.1%
2016-10-18T18:39:13      3         < 0.1%
2016-01-25T11:27:23      3         < 0.1%
2016-04-28T08:16:31      3         < 0.1%
2016-07-21T12:04:45      3         < 0.1%
2016-10-20T13:20:12      3         < 0.1%
2016-03-30T13:11:16      3         < 0.1%
2016-11-27T12:15:24      3         < 0.1%
2016-12-08T08:33:30      3         < 0.1%
2016-08-07T22:06:11      3         < 0.1%
2016-06-14T22:02:40      3         < 0.1%
2016-01-07T09:56:55      3         < 0.1%
2016-12-25T20:18:11      3         < 0.1%
2016-12-26T08:59:23      3         < 0.1%
Other values (635447)    641839    > 99.9%

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19
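
The report treats transactionDateTime as categorical, but the fixed 19-character length and the sample values are consistent with ISO 8601 timestamps. A minimal sketch, assuming every value follows the `YYYY-MM-DDTHH:MM:SS` layout seen in the samples, of parsing them into datetime objects so they can be ordered and compared:

```python
from datetime import datetime

# Sample values copied from the transactionDateTime frequency table.
samples = ["2016-12-30T09:23:08", "2016-07-19T13:03:15", "2016-08-07T03:58:57"]

# Every sample is 19 characters, matching the reported min/max length.
assert all(len(s) == 19 for s in samples)

# fromisoformat accepts this layout directly (Python 3.7+).
parsed = [datetime.fromisoformat(s) for s in samples]
earliest = min(parsed)
print(earliest.isoformat())
```

Parsing the column this way before profiling would likely remove the high-cardinality warning, since the variable would be handled as a date rather than as free text.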

Overview of Unicode Properties

Unique unicode characters: 13
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
0211118817.3%
 
1189367515.5%
 
2155006612.7%
 
-128382810.5%
 
:128382810.5%
 
69400737.7%
 
T6419145.3%
 
35683694.7%
 
55130844.2%
 
45104084.2%
 
83001992.5%
 
73001212.5%
 
92996132.5%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number898679673.7%
 
Dash Punctuation128382810.5%
 
Other Punctuation128382810.5%
 
Uppercase Letter6419145.3%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0211118823.5%
 
1189367521.1%
 
2155006617.2%
 
694007310.5%
 
35683696.3%
 
55130845.7%
 
45104085.7%
 
83001993.3%
 
73001213.3%
 
92996133.3%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-1283828100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
T641914100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
:1283828100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common1155445294.7%
 
Latin6419145.3%
 

Most frequent Common characters

ValueCountFrequency (%) 
0211118818.3%
 
1189367516.4%
 
2155006613.4%
 
-128382811.1%
 
:128382811.1%
 
69400738.1%
 
35683694.9%
 
55130844.4%
 
45104084.4%
 
83001992.6%
 
73001212.6%
 
92996132.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
T641914100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII12196366100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
0211118817.3%
 
1189367515.5%
 
2155006612.7%
 
-128382810.5%
 
:128382810.5%
 
69400737.7%
 
T6419145.3%
 
35683694.7%
 
55130844.2%
 
45104084.2%
 
83001992.5%
 
73001212.5%
 
92996132.5%
 

transactionAmount
Real number (ℝ≥0)

ZEROS

Distinct count: 62735
Unique (%): 9.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 135.16249698557746
Minimum: 0.0
Maximum: 1825.25
Zeros: 18479
Zeros (%): 2.9%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3.41
Q1: 32.32
Median: 85.8
Q3: 189.03
95-th percentile: 430.77
Maximum: 1825.25
Range: 1825.25
Interquartile range (IQR): 156.71

Descriptive statistics

Standard deviation: 147.0533021
Coefficient of variation (CV): 1.087974145
Kurtosis: 6.367300632
Mean: 135.162497
Median Absolute Deviation (MAD): 65.45
Skewness: 2.095715154
Sum: 86762699.09
Variance: 21624.67364
Histogram with fixed size bins (bins=10)
Value                   Count     Frequency (%)
0                       18479     2.9%
3.42                    120       < 0.1%
4.94                    119       < 0.1%
8.78                    118       < 0.1%
3.56                    113       < 0.1%
7.91                    112       < 0.1%
33.79                   112       < 0.1%
8.49                    110       < 0.1%
5.1                     108       < 0.1%
53.86                   104       < 0.1%
3.47                    102       < 0.1%
42.3                    101       < 0.1%
5.84                    101       < 0.1%
4.55                    100       < 0.1%
8.57                    99        < 0.1%
6.44                    98        < 0.1%
7.68                    98        < 0.1%
5.33                    97        < 0.1%
8.66                    96        < 0.1%
4.25                    96        < 0.1%
39.37                   96        < 0.1%
8.21                    96        < 0.1%
7.87                    95        < 0.1%
54.2                    95        < 0.1%
45.53                   95        < 0.1%
Other values (62710)    620954    96.7%

Value    Count    Frequency (%)
0        18479    2.9%
0.01     35       < 0.1%
0.02     44       < 0.1%
0.03     39       < 0.1%
0.04     36       < 0.1%
0.05     32       < 0.1%
0.06     45       < 0.1%
0.07     41       < 0.1%
0.08     36       < 0.1%
0.09     41       < 0.1%

Value      Count    Frequency (%)
1825.25    1        < 0.1%
1760.36    1        < 0.1%
1743.51    1        < 0.1%
1692.93    1        < 0.1%
1687.48    1        < 0.1%
1655.07    1        < 0.1%
1633.89    1        < 0.1%
1598.94    1        < 0.1%
1574.73    1        < 0.1%
1571.81    1        < 0.1%

merchantName
Categorical

HIGH CARDINALITY

Distinct count: 2493
Unique (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 4.9 MiB

Value                            Count     Frequency (%)
Lyft                             25311     3.9%
Uber                             25263     3.9%
gap.com                          13824     2.2%
apple.com                        13607     2.1%
target.com                       13601     2.1%
alibaba.com                      13583     2.1%
staples.com                      13512     2.1%
amazon.com                       13477     2.1%
ebay.com                         13472     2.1%
discount.com                     13394     2.1%
oldnavy.com                      13381     2.1%
walmart.com                      13282     2.1%
sears.com                        13279     2.1%
cheapfast.com                    13057     2.0%
Apple iTunes                     7579      1.2%
Play Store                       7035      1.1%
Mobile eCards                    4169      0.6%
Blue Mountain eCards             4165      0.6%
Blue Mountain Online Services    4149      0.6%
Fresh eCards                     4147      0.6%
Next Day Online Services         4135      0.6%
Fresh Flowers                    4112      0.6%
Next Day eCards                  4084      0.6%
Fresh Online Services            4084      0.6%
AMC #606218                      3371      0.5%
Other values (2468)              378841    59.0%

Length

Max length: 30
Median length: 13
Mean length: 13.8828893
Min length: 4

Overview of Unicode Properties

Unique unicode characters64
Unique unicode categories (?)6
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
6756777.6%
 
e5309206.0%
 
a5140005.8%
 
o4248064.8%
 
t4181744.7%
 
s4176304.7%
 
n3513593.9%
 
#2876133.2%
 
i2794243.1%
 
c2647333.0%
 
r2561922.9%
 
m2396872.7%
 
l2290322.6%
 
u2091782.3%
 
11929372.2%
 
41861272.1%
 
21859842.1%
 
31768252.0%
 
91743912.0%
 
.1735691.9%
 
61735251.9%
 
51672141.9%
 
81669251.9%
 
71497241.7%
 
y1391561.6%
 
Other values (39)192681921.6%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter501443556.3%
 
Decimal Number171019419.2%
 
Uppercase Letter99993811.2%
 
Space Separator6756777.6%
 
Other Punctuation5000405.6%
 
Dash Punctuation113370.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S10054710.1%
 
P953709.5%
 
C874838.7%
 
A750597.5%
 
M697207.0%
 
B691946.9%
 
D570625.7%
 
F503885.0%
 
U413244.1%
 
R369663.7%
 
H351453.5%
 
E348703.5%
 
G297653.0%
 
W280302.8%
 
L272852.7%
 
N267772.7%
 
Z265952.7%
 
T234432.3%
 
O213732.1%
 
K204392.0%
 
Q171571.7%
 
I154841.5%
 
Y43020.4%
 
V35070.4%
 
J26530.3%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e53092010.6%
 
a51400010.3%
 
o4248068.5%
 
t4181748.3%
 
s4176308.3%
 
n3513597.0%
 
i2794245.6%
 
c2647335.3%
 
r2561925.1%
 
m2396874.8%
 
l2290324.6%
 
u2091784.2%
 
y1391562.8%
 
b1070262.1%
 
p1062022.1%
 
h977781.9%
 
d950721.9%
 
g736961.5%
 
w655251.3%
 
f521391.0%
 
v450740.9%
 
z420390.8%
 
k403600.8%
 
x152330.3%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
675677100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
#28761357.5%
 
.17356934.7%
 
'388587.8%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
119293711.3%
 
418612710.9%
 
218598410.9%
 
317682510.3%
 
917439110.2%
 
617352510.1%
 
51672149.8%
 
81669259.8%
 
71497248.8%
 
01365428.0%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-11337100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin601437367.5%
 
Common289724832.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e5309208.8%
 
a5140008.5%
 
o4248067.1%
 
t4181747.0%
 
s4176306.9%
 
n3513595.8%
 
i2794244.6%
 
c2647334.4%
 
r2561924.3%
 
m2396874.0%
 
l2290323.8%
 
u2091783.5%
 
y1391562.3%
 
b1070261.8%
 
p1062021.8%
 
S1005471.7%
 
h977781.6%
 
P953701.6%
 
d950721.6%
 
C874831.5%
 
A750591.2%
 
g736961.2%
 
M697201.2%
 
B691941.2%
 
w655251.1%
 
Other values (24)69741011.6%
 

Most frequent Common characters

ValueCountFrequency (%) 
67567723.3%
 
#2876139.9%
 
11929376.7%
 
41861276.4%
 
21859846.4%
 
31768256.1%
 
91743916.0%
 
.1735696.0%
 
61735256.0%
 
51672145.8%
 
81669255.8%
 
71497245.2%
 
01365424.7%
 
'388581.3%
 
-113370.4%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII8911621100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
6756777.6%
 
e5309206.0%
 
a5140005.8%
 
o4248064.8%
 
t4181744.7%
 
s4176304.7%
 
n3513593.9%
 
#2876133.2%
 
i2794243.1%
 
c2647333.0%
 
r2561922.9%
 
m2396872.7%
 
l2290322.6%
 
u2091782.3%
 
11929372.2%
 
41861272.1%
 
21859842.1%
 
31768252.0%
 
91743912.0%
 
.1735691.9%
 
61735251.9%
 
51672141.9%
 
81669251.9%
 
71497241.7%
 
y1391561.6%
 
Other values (39)192681921.6%
 

acqCountry
Categorical

HIGH CORRELATION

Distinct count: 4
Unique (%): < 0.1%
Missing: 3913
Missing (%): 0.6%
Memory size: 4.9 MiB

Value        Count     Frequency (%)
US           632303    98.5%
MEX          2626      0.4%
CAN          1870      0.3%
PR           1202      0.2%
(Missing)    3913      0.6%

Length

Max length: 3
Median length: 2
Mean length: 2.013099886
Min length: 2

Overview of Unicode Properties

Unique unicode characters12
Unique unicode categories (?)2
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value    Count     Frequency (%)
U        632303    48.9%
S        632303    48.9%
n        7826      0.6%
a        3913      0.3%
M        2626      0.2%
E        2626      0.2%
X        2626      0.2%
C        1870      0.1%
A        1870      0.1%
N        1870      0.1%
P        1202      0.1%
R        1202      0.1%
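
The lowercase `n` and `a` counts are not letters from any country code. They are consistent with the 3913 missing values having been rendered as the string "nan" before the character analysis ran (2 × 3913 = 7826 occurrences of `n`, 3913 of `a`). This is an inference from the counts, not something the report states. A small sketch that rebuilds the character counts and the mean length under that assumption:

```python
from collections import Counter

# Value counts copied from the acqCountry table; "nan" stands in for the
# 3913 missing entries (an assumption about how they were stringified).
value_counts = {"US": 632303, "MEX": 2626, "CAN": 1870, "PR": 1202, "nan": 3913}

# Count every character, weighted by how often its value occurs.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(value_counts.values())
print(char_counts["n"], char_counts["a"], total_chars, round(mean_length, 6))
```

The reconstructed totals agree with the character table and with the reported mean length, which supports reading the mean length of slightly over 2 as an artifact of missing values rather than of the country codes themselves.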
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter128049899.1%
 
Lowercase Letter117390.9%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
U63230349.4%
 
S63230349.4%
 
M26260.2%
 
E26260.2%
 
X26260.2%
 
C18700.1%
 
A18700.1%
 
N18700.1%
 
P12020.1%
 
R12020.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n782666.7%
 
a391333.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin1292237100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
U63230348.9%
 
S63230348.9%
 
n78260.6%
 
a39130.3%
 
M26260.2%
 
E26260.2%
 
X26260.2%
 
C18700.1%
 
A18700.1%
 
N18700.1%
 
P12020.1%
 
R12020.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1292237100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
U63230348.9%
 
S63230348.9%
 
n78260.6%
 
a39130.3%
 
M26260.2%
 
E26260.2%
 
X26260.2%
 
C18700.1%
 
A18700.1%
 
N18700.1%
 
P12020.1%
 
R12020.1%
 

merchantCountryCode
Categorical

HIGH CORRELATION

Distinct count: 4
Unique (%): < 0.1%
Missing: 624
Missing (%): 0.1%
Memory size: 4.9 MiB

Value        Count     Frequency (%)
US           635577    99.0%
MEX          2636      0.4%
CAN          1874      0.3%
PR           1203      0.2%
(Missing)    624       0.1%

Length

Max length: 3
Median length: 2
Mean length: 2.007997956
Min length: 2

Overview of Unicode Properties

Unique unicode characters12
Unique unicode categories (?)2
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
U63557749.3%
 
S63557749.3%
 
M26360.2%
 
E26360.2%
 
X26360.2%
 
C18740.1%
 
A18740.1%
 
N18740.1%
 
n12480.1%
 
P12030.1%
 
R12030.1%
 
a624< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter128709099.9%
 
Lowercase Letter18720.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
U63557749.4%
 
S63557749.4%
 
M26360.2%
 
E26360.2%
 
X26360.2%
 
C18740.1%
 
A18740.1%
 
N18740.1%
 
P12030.1%
 
R12030.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n124866.7%
 
a62433.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin1288962100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
U63557749.3%
 
S63557749.3%
 
M26360.2%
 
E26360.2%
 
X26360.2%
 
C18740.1%
 
A18740.1%
 
N18740.1%
 
n12480.1%
 
P12030.1%
 
R12030.1%
 
a624< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1288962100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
U63557749.3%
 
S63557749.3%
 
M26360.2%
 
E26360.2%
 
X26360.2%
 
C18740.1%
 
A18740.1%
 
N18740.1%
 
n12480.1%
 
P12030.1%
 
R12030.1%
 
a624< 0.1%
 

posEntryMode
Categorical

Distinct count: 5
Unique (%): < 0.1%
Missing: 3345
Missing (%): 0.5%
Memory size: 4.9 MiB

Value        Count     Frequency (%)
05           255615    39.8%
09           193193    30.1%
02           160589    25.0%
90           16251     2.5%
80           12921     2.0%
(Missing)    3345      0.5%

Length

Max length3
Median length2
Mean length2.005210978
Min length2

Overview of Unicode Properties

Unique unicode characters7
Unique unicode categories (?)2
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
063856949.6%
 
525561519.9%
 
920944416.3%
 
216058912.5%
 
8129211.0%
 
n66900.5%
 
a33450.3%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number127713899.2%
 
Lowercase Letter100350.8%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
063856950.0%
 
525561520.0%
 
920944416.4%
 
216058912.6%
 
8129211.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n669066.7%
 
a334533.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common127713899.2%
 
Latin100350.8%
 

Most frequent Common characters

ValueCountFrequency (%) 
063856950.0%
 
525561520.0%
 
920944416.4%
 
216058912.6%
 
8129211.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n669066.7%
 
a334533.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1287173100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
063856949.6%
 
525561519.9%
 
920944416.3%
 
216058912.5%
 
8129211.0%
 
n66900.5%
 
a33450.3%
 

posConditionCode
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 287
Missing (%): < 0.1%
Memory size: 4.9 MiB

Value        Count     Frequency (%)
01           514144    80.1%
08           121507    18.9%
99           5976      0.9%
(Missing)    287       < 0.1%

Length

Max length3
Median length2
Mean length2.0004471
Min length2

Overview of Unicode Properties

Unique unicode characters6
Unique unicode categories (?)2
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
063565149.5%
 
151414440.0%
 
81215079.5%
 
9119520.9%
 
n574< 0.1%
 
a287< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number128325499.9%
 
Lowercase Letter8610.1%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
063565149.5%
 
151414440.1%
 
81215079.5%
 
9119520.9%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n57466.7%
 
a28733.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Common128325499.9%
 
Latin8610.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
063565149.5%
 
151414440.1%
 
81215079.5%
 
9119520.9%
 

Most frequent Latin characters

ValueCountFrequency (%) 
n57466.7%
 
a28733.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1284115100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
063565149.5%
 
151414440.0%
 
81215079.5%
 
9119520.9%
 
n574< 0.1%
 
a287< 0.1%
 
Distinct count19
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.9 MiB
online_retail
161469
fastfood
101196
entertainment
69138
food
68245
rideshare
50574
Other values (14)
191292
ValueCountFrequency (%) 
online_retail16146925.2%
 
fastfood10119615.8%
 
entertainment6913810.8%
 
food6824510.6%
 
rideshare505747.9%
 
online_gifts330455.1%
 
hotels228793.6%
 
fuel225663.5%
 
subscriptions183762.9%
 
personal care169172.6%
 
mobileapps146142.3%
 
health143442.2%
 
online_subscriptions112471.8%
 
auto101471.6%
 
airline99901.6%
 
furniture78131.2%
 
food_delivery49900.8%
 
gym28740.4%
 
cable/phone14900.2%
 

Length

Max length20
Median length10
Mean length9.886609421
Min length3

Overview of Unicode Properties

Unique unicode characters23
Unique unicode categories (?)4
Unique unicode scripts (?)2
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e81479212.8%
 
n68476910.8%
 
o65029310.2%
 
i6266309.9%
 
t5879309.3%
 
l4750207.5%
 
a4667967.4%
 
r4258186.7%
 
f3390515.3%
 
s3280945.2%
 
d2299953.6%
 
_2107513.3%
 
h1036311.6%
 
m866261.4%
 
u779621.2%
 
p772581.2%
 
c480300.8%
 
b457270.7%
 
g359190.6%
 
169170.3%
 
y78640.1%
 
v49900.1%
 
/1490< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter611719596.4%
 
Connector Punctuation2107513.3%
 
Space Separator169170.3%
 
Other Punctuation1490< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e81479213.3%
 
n68476911.2%
 
o65029310.6%
 
i62663010.2%
 
t5879309.6%
 
l4750207.8%
 
a4667967.6%
 
r4258187.0%
 
f3390515.5%
 
s3280945.4%
 
d2299953.8%
 
h1036311.7%
 
m866261.4%
 
u779621.3%
 
p772581.3%
 
c480300.8%
 
b457270.7%
 
g359190.6%
 
y78640.1%
 
v49900.1%
 

Most frequent Connector Punctuation characters

ValueCountFrequency (%) 
_210751100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
16917100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/1490100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin611719596.4%
 
Common2291583.6%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e81479213.3%
 
n68476911.2%
 
o65029310.6%
 
i62663010.2%
 
t5879309.6%
 
l4750207.8%
 
a4667967.6%
 
r4258187.0%
 
f3390515.5%
 
s3280945.4%
 
d2299953.8%
 
h1036311.7%
 
m866261.4%
 
u779621.3%
 
p772581.3%
 
c480300.8%
 
b457270.7%
 
g359190.6%
 
y78640.1%
 
v49900.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
_21075192.0%
 
169177.4%
 
/14900.7%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII6346353100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e81479212.8%
 
n68476910.8%
 
o65029310.2%
 
i6266309.9%
 
t5879309.3%
 
l4750207.5%
 
a4667967.4%
 
r4258186.7%
 
f3390515.3%
 
s3280945.2%
 
d2299953.6%
 
_2107513.3%
 
h1036311.6%
 
m866261.4%
 
u779621.2%
 
p772581.2%
 
c480300.8%
 
b457270.7%
 
g359190.6%
 
169170.3%
 
y78640.1%
 
v49900.1%
 
/1490< 0.1%
 

currentExpDate
Categorical

HIGH CARDINALITY

Distinct count165
Unique (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.9 MiB
05/2026
 
4209
10/2019
 
4201
08/2020
 
4188
05/2028
 
4186
01/2025
 
4154
Other values (160)
620976
ValueCountFrequency (%) 
05/202642090.7%
 
10/201942010.7%
 
08/202041880.7%
 
05/202841860.7%
 
01/202541540.6%
 
05/202441530.6%
 
08/202841190.6%
 
03/202441170.6%
 
08/201841160.6%
 
03/201941150.6%
 
07/201841090.6%
 
05/202541040.6%
 
10/202341030.6%
 
03/202840990.6%
 
08/202640970.6%
 
12/202140940.6%
 
08/202540910.6%
 
01/202340910.6%
 
01/202140900.6%
 
03/202340810.6%
 
05/203140740.6%
 
10/203140730.6%
 
05/202940680.6%
 
10/202440680.6%
 
03/203040660.6%
 
Other values (140)53904884.0%
 

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7
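
Every currentExpDate value is 7 characters long, which fits the `MM/YYYY` layout of the sample values. A brief sketch, assuming that layout holds for the whole column, of converting the strings to (year, month) pairs so expiry dates can be compared chronologically rather than as text:

```python
from datetime import datetime

# Sample values copied from the currentExpDate frequency table.
samples = ["05/2026", "10/2019", "08/2020"]

# strptime with "%m/%Y" fills in day 1 of the month.
parsed = [datetime.strptime(s, "%m/%Y") for s in samples]
months = [(d.year, d.month) for d in parsed]

# Sorting the parsed values gives chronological order, unlike string sort.
print(sorted(months))
```

Sorting the raw strings would order by month first, which is one practical reason to parse the column before analysis.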

Overview of Unicode Properties

Unique unicode characters11
Unique unicode categories (?)2
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
0126850228.2%
 
2126330228.1%
 
/64191414.3%
 
14433989.9%
 
31969324.4%
 
91468913.3%
 
81317492.9%
 
71019282.3%
 
51007422.2%
 
61007132.2%
 
4973272.2%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number385148485.7%
 
Other Punctuation64191414.3%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0126850232.9%
 
2126330232.8%
 
144339811.5%
 
31969325.1%
 
91468913.8%
 
81317493.4%
 
71019282.6%
 
51007422.6%
 
61007132.6%
 
4973272.5%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/641914100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common4493398100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
0126850228.2%
 
2126330228.1%
 
/64191414.3%
 
14433989.9%
 
31969324.4%
 
91468913.3%
 
81317492.9%
 
71019282.3%
 
51007422.2%
 
61007132.2%
 
4973272.2%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII4493398100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
0126850228.2%
 
2126330228.1%
 
/64191414.3%
 
14433989.9%
 
31969324.4%
 
91468913.3%
 
81317492.9%
 
71019282.3%
 
51007422.2%
 
61007132.2%
 
4973272.2%
 

accountOpenDate
Categorical

HIGH CARDINALITY

Distinct count1826
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size4.9 MiB
2015-12-11
 
10137
2012-10-05
 
8382
2011-05-20
 
5494
2015-09-24
 
5478
2015-03-12
 
5398
Other values (1821)
607025
ValueCountFrequency (%) 
2015-12-11101371.6%
 
2012-10-0583821.3%
 
2011-05-2054940.9%
 
2015-09-2454780.9%
 
2015-03-1253980.8%
 
2013-08-2448360.8%
 
2014-01-3147530.7%
 
2015-06-1546120.7%
 
2013-07-0437800.6%
 
2014-09-1833110.5%
 
2015-01-1032090.5%
 
2014-08-2731620.5%
 
2014-01-1130810.5%
 
2015-12-2630510.5%
 
2013-09-0930460.5%
 
2014-05-2429830.5%
 
2014-06-0627150.4%
 
2013-06-1526870.4%
 
2014-05-2526140.4%
 
2015-01-1225840.4%
 
2012-10-3024110.4%
 
2015-05-2024070.4%
 
2015-06-2623950.4%
 
2015-03-3022850.4%
 
2015-03-1622720.4%
 
Other values (1801)54483184.9%
 

Length

Max length10
Median length10
Mean length10
Min length10

Overview of Unicode Properties

Unique unicode characters11
Unique unicode categories (?)2
Unique unicode scripts (?)1
Unique unicode blocks (?)1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
0144100122.4%
 
-128382820.0%
 
1125145419.5%
 
2108895517.0%
 
53904956.1%
 
42556964.0%
 
32370233.7%
 
61215561.9%
 
91203941.9%
 
81200161.9%
 
71087221.7%
 

Most occurring categories

ValueCountFrequency (%) 
Decimal Number513531280.0%
 
Dash Punctuation128382820.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
0144100128.1%
 
1125145424.4%
 
2108895521.2%
 
53904957.6%
 
42556965.0%
 
32370234.6%
 
61215562.4%
 
91203942.3%
 
81200162.3%
 
71087222.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-1283828100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Common6419140100.0%
 

Most frequent Common characters

ValueCountFrequency (%) 
0144100122.4%
 
-128382820.0%
 
1125145419.5%
 
2108895517.0%
 
53904956.1%
 
42556964.0%
 
32370233.7%
 
61215561.9%
 
91203941.9%
 
81200161.9%
 
71087221.7%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII6419140100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
0144100122.4%
 
-128382820.0%
 
1125145419.5%
 
2108895517.0%
 
53904956.1%
 
42556964.0%
 
32370233.7%
 
61215561.9%
 
91203941.9%
 
81200161.9%
 
71087221.7%
 

dateOfLastAddressChange
Categorical

HIGH CARDINALITY

Distinct count2186
Unique (%)0.3%
Missing0
Missing (%)0.0%
Memory size4.9 MiB
2016-07-20
 
3948
2016-03-15
 
3800
2016-01-26
 
3140
2016-01-29
 
3033
2016-04-25
 
2954
Other values (2181)
625039
ValueCountFrequency (%) 
2016-07-2039480.6%
 
2016-03-1538000.6%
 
2016-01-2631400.5%
 
2016-01-2930330.5%
 
2016-04-2529540.5%
 
2016-07-2229430.5%
 
2016-06-1228100.4%
 
2016-06-0626720.4%
 
2016-04-1126380.4%
 
2016-01-2025990.4%
 
2016-01-1624000.4%
 
2016-07-1823680.4%
 
2016-06-1623370.4%
 
2016-08-0423320.4%
 
2016-05-1723260.4%
 
2016-08-0123220.4%
 
2016-08-0222310.3%
 
2016-03-1622010.3%
 
2016-06-1421700.3%
 
2016-05-2521020.3%
 
2016-02-0720940.3%
 
2016-03-2620220.3%
 
2016-02-2520210.3%
 
2016-05-1120080.3%
 
2016-02-1319600.3%
 
Other values (2161)57848390.1%
 

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Overview of Unicode Properties

Unique unicode characters: 11
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
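As a rough illustration of how such a per-category breakdown could be derived, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. The helper name `category_counts` and the two-value sample are hypothetical and are not taken from the profiling tool; script and block lookups are not covered because the standard library does not expose them.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            # e.g. "Nd" = Decimal Number, "Pd" = Dash Punctuation
            counts[unicodedata.category(ch)] += 1
    return counts


# Hypothetical two-row sample in the same YYYY-MM-DD shape as this column.
sample = ["2016-07-20", "2016-03-15"]
counts = category_counts(sample)
total = sum(counts.values())
for category, count in counts.most_common():
    print(category, count, f"{100 * count / total:.1f}%")
```

Each date string holds eight digits and two hyphens, which is consistent with the 80.0% / 20.0% split between Decimal Number and Dash Punctuation reported for this variable.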

Most occurring characters

Value | Count | Frequency (%)
0 | 1465305 | 22.8%
- | 1283828 | 20.0%
1 | 1183131 | 18.4%
2 | 1054724 | 16.4%
6 | 407039 | 6.3%
5 | 278608 | 4.3%
3 | 204439 | 3.2%
4 | 198616 | 3.1%
8 | 118059 | 1.8%
7 | 115894 | 1.8%
9 | 109497 | 1.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 5135312 | 80.0%
Dash Punctuation | 1283828 | 20.0%

Most frequent Decimal Number characters

Value | Count | Frequency (%)
0 | 1465305 | 28.5%
1 | 1183131 | 23.0%
2 | 1054724 | 20.5%
6 | 407039 | 7.9%
5 | 278608 | 5.4%
3 | 204439 | 4.0%
4 | 198616 | 3.9%
8 | 118059 | 2.3%
7 | 115894 | 2.3%
9 | 109497 | 2.1%

Most frequent Dash Punctuation characters

Value | Count | Frequency (%)
- | 1283828 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 6419140 | 100.0%

Most frequent Common characters

Value | Count | Frequency (%)
0 | 1465305 | 22.8%
- | 1283828 | 20.0%
1 | 1183131 | 18.4%
2 | 1054724 | 16.4%
6 | 407039 | 6.3%
5 | 278608 | 4.3%
3 | 204439 | 3.2%
4 | 198616 | 3.1%
8 | 118059 | 1.8%
7 | 115894 | 1.8%
9 | 109497 | 1.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6419140 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
0 | 1465305 | 22.8%
- | 1283828 | 20.0%
1 | 1183131 | 18.4%
2 | 1054724 | 16.4%
6 | 407039 | 6.3%
5 | 278608 | 4.3%
3 | 204439 | 3.2%
4 | 198616 | 3.1%
8 | 118059 | 1.8%
7 | 115894 | 1.8%
9 | 109497 | 1.7%

cardCVV
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 899
Unique (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 557.1999270930373
Minimum: 100
Maximum: 998
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 100
5-th percentile: 148
Q1: 334
median: 581
Q3: 762
95-th percentile: 954
Maximum: 998
Range: 898
Interquartile range (IQR): 428

Descriptive statistics

Standard deviation: 257.3262041
Coefficient of variation (CV): 0.46182024
Kurtosis: -1.147292612
Mean: 557.1999271
Median Absolute Deviation (MAD): 214
Skewness: -0.07758885003
Sum: 357674434
Variance: 66216.7753
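Several of the figures above are simple functions of one another. The following sketch recomputes them from the reported cardCVV values as a sanity check; the variable names are illustrative only, and the inputs are copied from the summary rather than from the underlying data.

```python
# Values copied from the cardCVV summary in this report.
mean = 557.1999271
std = 257.3262041
minimum, maximum = 100, 998
q1, q3 = 334, 762

cv = std / mean                  # coefficient of variation = std / mean
variance = std ** 2              # variance = std squared
value_range = maximum - minimum  # range = max - min
iqr = q3 - q1                    # interquartile range = Q3 - Q1

print(f"CV={cv:.8f} variance={variance:.4f} range={value_range} IQR={iqr}")
```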
Histogram with fixed size bins (bins=10)
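Common values

Value | Count | Frequency (%)
633 | 11354 | 1.8%
746 | 8886 | 1.4%
625 | 7626 | 1.2%
312 | 6583 | 1.0%
986 | 6464 | 1.0%
676 | 5780 | 0.9%
731 | 4979 | 0.8%
180 | 4039 | 0.6%
465 | 3869 | 0.6%
654 | 3808 | 0.6%
324 | 3633 | 0.6%
815 | 3567 | 0.6%
713 | 3529 | 0.5%
467 | 3518 | 0.5%
383 | 3357 | 0.5%
148 | 2892 | 0.5%
910 | 2818 | 0.4%
874 | 2751 | 0.4%
921 | 2733 | 0.4%
135 | 2657 | 0.4%
166 | 2643 | 0.4%
439 | 2608 | 0.4%
576 | 2573 | 0.4%
126 | 2503 | 0.4%
951 | 2440 | 0.4%
Other values (874) | 534304 | 83.2%

Minimum 10 values

Value | Count | Frequency (%)
100 | 431 | 0.1%
101 | 116 | < 0.1%
102 | 343 | 0.1%
103 | 77 | < 0.1%
104 | 1770 | 0.3%
105 | 1085 | 0.2%
106 | 1119 | 0.2%
107 | 300 | < 0.1%
108 | 2193 | 0.3%
109 | 781 | 0.1%

Maximum 10 values

Value | Count | Frequency (%)
998 | 209 | < 0.1%
997 | 378 | 0.1%
996 | 136 | < 0.1%
995 | 259 | < 0.1%
994 | 331 | 0.1%
993 | 1424 | 0.2%
992 | 570 | 0.1%
991 | 519 | 0.1%
990 | 278 | < 0.1%
989 | 699 | 0.1%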

enteredCVV
Real number (ℝ≥0)

HIGH CORRELATION

Distinct count: 980
Unique (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 556.775159912387
Minimum: 1
Maximum: 998
Zeros: 0
Zeros (%): 0.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 1
5-th percentile: 148
Q1: 333
median: 580
Q3: 761
95-th percentile: 954
Maximum: 998
Range: 997
Interquartile range (IQR): 428

Descriptive statistics

Standard deviation: 257.4026393
Coefficient of variation (CV): 0.4623098476
Kurtosis: -1.146706164
Mean: 556.7751599
Median Absolute Deviation (MAD): 214
Skewness: -0.0771430696
Sum: 357401770
Variance: 66256.11874
Histogram with fixed size bins (bins=10)
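Common values

Value | Count | Frequency (%)
633 | 11254 | 1.8%
746 | 8816 | 1.4%
625 | 7559 | 1.2%
312 | 6524 | 1.0%
986 | 6399 | 1.0%
676 | 5745 | 0.9%
731 | 4923 | 0.8%
180 | 4007 | 0.6%
465 | 3825 | 0.6%
654 | 3775 | 0.6%
324 | 3603 | 0.6%
815 | 3529 | 0.5%
467 | 3503 | 0.5%
713 | 3495 | 0.5%
383 | 3345 | 0.5%
148 | 2864 | 0.4%
910 | 2789 | 0.4%
874 | 2743 | 0.4%
921 | 2704 | 0.4%
135 | 2640 | 0.4%
166 | 2622 | 0.4%
439 | 2590 | 0.4%
576 | 2569 | 0.4%
126 | 2508 | 0.4%
951 | 2425 | 0.4%
Other values (955) | 535158 | 83.4%

Minimum 10 values

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 2 | < 0.1%
6 | 2 | < 0.1%
7 | 5 | < 0.1%
8 | 2 | < 0.1%
9 | 4 | < 0.1%
10 | 4 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
998 | 208 | < 0.1%
997 | 378 | 0.1%
996 | 139 | < 0.1%
995 | 259 | < 0.1%
994 | 328 | 0.1%
993 | 1410 | 0.2%
992 | 569 | 0.1%
991 | 519 | 0.1%
990 | 279 | < 0.1%
989 | 692 | 0.1%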

cardLast4Digits
Real number (ℝ≥0)

ZEROS

Distinct count: 5134
Unique (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4886.18404334537
Minimum: 0
Maximum: 9998
Zeros: 6727
Zeros (%): 1.0%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 359
Q1: 2364
median: 4873
Q3: 7267
95-th percentile: 9484
Maximum: 9998
Range: 9998
Interquartile range (IQR): 4903

Descriptive statistics

Standard deviation: 2859.053679
Coefficient of variation (CV): 0.5851301657
Kurtosis: -1.145169532
Mean: 4886.184043
Median Absolute Deviation (MAD): 2435
Skewness: 0.02565465023
Sum: 3136509944
Variance: 8174187.938
Histogram with fixed size bins (bins=10)
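Common values

Value | Count | Frequency (%)
1789 | 10034 | 1.6%
5658 | 8412 | 1.3%
0 | 6727 | 1.0%
5335 | 5542 | 0.9%
4062 | 5146 | 0.8%
4690 | 4435 | 0.7%
7267 | 4227 | 0.7%
2705 | 3766 | 0.6%
2640 | 3414 | 0.5%
6060 | 3046 | 0.5%
1548 | 3021 | 0.5%
4737 | 2983 | 0.5%
4008 | 2615 | 0.4%
3165 | 2387 | 0.4%
8212 | 2379 | 0.4%
5358 | 2344 | 0.4%
1437 | 2326 | 0.4%
9716 | 2275 | 0.4%
3147 | 2084 | 0.3%
6310 | 1978 | 0.3%
4431 | 1956 | 0.3%
2242 | 1917 | 0.3%
6606 | 1875 | 0.3%
2330 | 1855 | 0.3%
6616 | 1754 | 0.3%
Other values (5109) | 553416 | 86.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 6727 | 1.0%
1 | 51 | < 0.1%
3 | 157 | < 0.1%
4 | 28 | < 0.1%
5 | 46 | < 0.1%
7 | 106 | < 0.1%
9 | 18 | < 0.1%
10 | 74 | < 0.1%
11 | 30 | < 0.1%
16 | 18 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
9998 | 53 | < 0.1%
9997 | 1 | < 0.1%
9995 | 152 | < 0.1%
9994 | 26 | < 0.1%
9990 | 95 | < 0.1%
9988 | 2 | < 0.1%
9985 | 100 | < 0.1%
9984 | 1431 | 0.2%
9983 | 9 | < 0.1%
9982 | 519 | 0.1%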

transactionType
Categorical

Distinct count: 3
Unique (%): < 0.1%
Missing: 589
Missing (%): 0.1%
Memory size: 4.9 MiB

Top values (bar chart)

PURCHASE: 608685
ADDRESS_VERIFICATION: 16478
REVERSAL: 16162

Value | Count | Frequency (%)
PURCHASE | 608685 | 94.8%
ADDRESS_VERIFICATION | 16478 | 2.6%
REVERSAL | 16162 | 2.5%
(Missing) | 589 | 0.1%

Length

Max length: 20
Median length: 8
Mean length: 8.303453422
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 19
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

Value | Count | Frequency (%)
R | 673965 | 12.6%
E | 673965 | 12.6%
A | 657803 | 12.3%
S | 657803 | 12.3%
C | 625163 | 11.7%
P | 608685 | 11.4%
U | 608685 | 11.4%
H | 608685 | 11.4%
I | 49434 | 0.9%
D | 32956 | 0.6%
V | 32640 | 0.6%
_ | 16478 | 0.3%
F | 16478 | 0.3%
T | 16478 | 0.3%
O | 16478 | 0.3%
N | 16478 | 0.3%
L | 16162 | 0.3%
n | 1178 | < 0.1%
a | 589 | < 0.1%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 5311858 | 99.7%
Connector Punctuation | 16478 | 0.3%
Lowercase Letter | 1767 | < 0.1%

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
R | 673965 | 12.7%
E | 673965 | 12.7%
A | 657803 | 12.4%
S | 657803 | 12.4%
C | 625163 | 11.8%
P | 608685 | 11.5%
U | 608685 | 11.5%
H | 608685 | 11.5%
I | 49434 | 0.9%
D | 32956 | 0.6%
V | 32640 | 0.6%
F | 16478 | 0.3%
T | 16478 | 0.3%
O | 16478 | 0.3%
N | 16478 | 0.3%
L | 16162 | 0.3%

Most frequent Connector Punctuation characters

Value | Count | Frequency (%)
_ | 16478 | 100.0%

Most frequent Lowercase Letter characters

Value | Count | Frequency (%)
n | 1178 | 66.7%
a | 589 | 33.3%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5313625 | 99.7%
Common | 16478 | 0.3%

Most frequent Latin characters

Value | Count | Frequency (%)
R | 673965 | 12.7%
E | 673965 | 12.7%
A | 657803 | 12.4%
S | 657803 | 12.4%
C | 625163 | 11.8%
P | 608685 | 11.5%
U | 608685 | 11.5%
H | 608685 | 11.5%
I | 49434 | 0.9%
D | 32956 | 0.6%
V | 32640 | 0.6%
F | 16478 | 0.3%
T | 16478 | 0.3%
O | 16478 | 0.3%
N | 16478 | 0.3%
L | 16162 | 0.3%
n | 1178 | < 0.1%
a | 589 | < 0.1%

Most frequent Common characters

Value | Count | Frequency (%)
_ | 16478 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 5330103 | 100.0%

Most frequent ASCII characters

Value | Count | Frequency (%)
R | 673965 | 12.6%
E | 673965 | 12.6%
A | 657803 | 12.3%
S | 657803 | 12.3%
C | 625163 | 11.7%
P | 608685 | 11.4%
U | 608685 | 11.4%
H | 608685 | 11.4%
I | 49434 | 0.9%
D | 32956 | 0.6%
V | 32640 | 0.6%
_ | 16478 | 0.3%
F | 16478 | 0.3%
T | 16478 | 0.3%
O | 16478 | 0.3%
N | 16478 | 0.3%
L | 16162 | 0.3%
n | 1178 | < 0.1%
a | 589 | < 0.1%

isFraud
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 627.0 KiB

Value | Count | Frequency (%)
False | 630612 | 98.2%
True | 11302 | 1.8%

currentBalance
Real number (ℝ≥0)

ZEROS

Distinct count: 406990
Unique (%): 63.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4044.3820349922275
Minimum: 0.0
Maximum: 47496.5
Zeros: 33622
Zeros (%): 5.2%
Memory size: 4.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 502.4425
median: 2151.86
Q3: 5005.89
95-th percentile: 13782.554
Maximum: 47496.5
Range: 47496.5
Interquartile range (IQR): 4503.4475

Descriptive statistics

Standard deviation: 5945.510224
Coefficient of variation (CV): 1.470066421
Kurtosis: 17.3694379
Mean: 4044.382035
Median Absolute Deviation (MAD): 1893.22
Skewness: 3.600021658
Sum: 2596145450
Variance: 35349091.82
Histogram with fixed size bins (bins=10)
Common values

Value | Count | Frequency (%)
0 | 33622 | 5.2%
53.52 | 21 | < 0.1%
40.48 | 19 | < 0.1%
45.53 | 18 | < 0.1%
53.86 | 18 | < 0.1%
21.37 | 17 | < 0.1%
6.03 | 16 | < 0.1%
63.01 | 16 | < 0.1%
6.24 | 16 | < 0.1%
6.96 | 16 | < 0.1%
29.72 | 16 | < 0.1%
14.52 | 15 | < 0.1%
7.67 | 15 | < 0.1%
5.33 | 15 | < 0.1%
118.22 | 15 | < 0.1%
35.71 | 15 | < 0.1%
31.41 | 15 | < 0.1%
214.72 | 14 | < 0.1%
91.06 | 14 | < 0.1%
5.38 | 14 | < 0.1%
35.06 | 14 | < 0.1%
16.12 | 14 | < 0.1%
7.23 | 14 | < 0.1%
6.77 | 14 | < 0.1%
27.05 | 14 | < 0.1%
Other values (406965) | 607917 | 94.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 33622 | 5.2%
0.01 | 1 | < 0.1%
0.03 | 1 | < 0.1%
0.04 | 2 | < 0.1%
0.05 | 4 | < 0.1%
0.06 | 4 | < 0.1%
0.07 | 1 | < 0.1%
0.08 | 6 | < 0.1%
0.09 | 3 | < 0.1%
0.11 | 4 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
47496.5 | 1 | < 0.1%
47496.34 | 1 | < 0.1%
47494.26 | 1 | < 0.1%
47491.52 | 1 | < 0.1%
47490.97 | 1 | < 0.1%
47490.73 | 1 | < 0.1%
47490.66 | 1 | < 0.1%
47483.99 | 1 | < 0.1%
47481.13 | 1 | < 0.1%
47480.98 | 1 | < 0.1%

cardPresent
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 627.0 KiB

Value | Count | Frequency (%)
False | 340453 | 53.0%
True | 301461 | 47.0%

expirationDateKeyInMatch
Boolean

Distinct count: 2
Unique (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 627.0 KiB

Value | Count | Frequency (%)
False | 640945 | 99.8%
True | 969 | 0.2%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that, for a linear function, the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
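The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the profiling tool's own code; the function name `pearson_r` and the toy inputs are assumptions made for the example.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


# Hypothetical toy data: y is an exact linear function of x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]
print(pearson_r(xs, ys))

# Shifting and rescaling x leaves r unchanged (location/scale invariance).
print(pearson_r([10 * x + 7 for x in xs], ys))
```

The same 1/n factor appears in the covariance and in both standard deviations, so using the sample (n - 1) versions throughout would give the same r.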

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
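A minimal sketch of that procedure: convert each variable to ranks, then apply the covariance-over-standard-deviations formula to the ranks. The helper names and the toy data are hypothetical, and ties are given the average of their positions, which is a common convention but an assumption here.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of X and Y."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical toy data: y = x**3 is monotonic but not linear in x.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys))
```

Because the cubic relationship preserves order exactly, the ranks of the two variables coincide and ρ comes out at 1 even though the relationship is not linear.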

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
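The pair-counting definition above can be written directly as a short, quadratic-time sketch. The function name `kendall_tau_a` and the sample inputs are hypothetical. Note that this follows the description literally (often called tau-a); statistical libraries commonly apply a tie adjustment (tau-b) instead, which this sketch omits.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y; it counts toward neither group
    return (concordant - discordant) / len(pairs)


# Hypothetical toy data: one of the three pairs is out of order.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

With three observations there are three pairs; two are concordant and one is discordant, giving τ = (2 - 1) / 3.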

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
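For illustration, a bias-corrected Cramér's V in the commonly cited Bergsma (2013) form can be sketched as below. The function name `cramers_v_corrected` and the sample data are hypothetical, and whether the report applies exactly this formula is an assumption.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long lists of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0


# Hypothetical labels: two columns that always agree, then two unrelated ones.
same = ["US"] * 50 + ["CAN"] * 50
print(cramers_v_corrected(same, same))
print(cramers_v_corrected(["a", "a", "b", "b"] * 25, ["x", "y", "x", "y"] * 25))
```

The first call describes perfect association and the second an exactly independent table, so they bracket the 0 to 1 range described above.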

Missing values

Sample

First rows

index | accountNumber | customerId | creditLimit | availableMoney | transactionDateTime | transactionAmount | merchantName | acqCountry | merchantCountryCode | posEntryMode | posConditionCode | merchantCategoryCode | currentExpDate | accountOpenDate | dateOfLastAddressChange | cardCVV | enteredCVV | cardLast4Digits | transactionType | isFraud | currentBalance | cardPresent | expirationDateKeyInMatch
0 | 733493772 | 733493772 | 5000 | 5000.00 | 2016-01-08T19:04:50 | 111.33 | Lyft | US | US | 05 | 01 | rideshare | 04/2020 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | PURCHASE | True | 0.00 | False | False
1 | 733493772 | 733493772 | 5000 | 4888.67 | 2016-01-09T22:32:39 | 24.75 | Uber | US | US | 09 | 01 | rideshare | 06/2023 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | PURCHASE | False | 111.33 | False | False
2 | 733493772 | 733493772 | 5000 | 4863.92 | 2016-01-11T13:36:55 | 187.40 | Lyft | US | US | 05 | 01 | rideshare | 12/2027 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | PURCHASE | False | 136.08 | False | False
3 | 733493772 | 733493772 | 5000 | 4676.52 | 2016-01-11T22:47:46 | 227.34 | Lyft | US | US | 02 | 01 | rideshare | 09/2029 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | PURCHASE | True | 323.48 | False | False
4 | 733493772 | 733493772 | 5000 | 4449.18 | 2016-01-16T01:41:11 | 0.00 | Lyft | US | US | 02 | 01 | rideshare | 10/2024 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | ADDRESS_VERIFICATION | False | 550.82 | False | False
5 | 733493772 | 733493772 | 5000 | 4449.18 | 2016-01-16T21:35:27 | 9.80 | Fresh eCards | US | US | 05 | 01 | online_gifts | 02/2021 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | PURCHASE | False | 550.82 | False | False
6 | 733493772 | 733493772 | 5000 | 4439.38 | 2016-01-24T07:54:01 | 247.99 | Uber | NaN | US | 05 | 01 | rideshare | 01/2026 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | PURCHASE | False | 560.62 | False | False
7 | 733493772 | 733493772 | 5000 | 4191.39 | 2016-01-26T05:28:24 | 0.00 | Universe Massage #95463 | US | US | 05 | 01 | personal care | 12/2031 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | ADDRESS_VERIFICATION | False | 808.61 | False | False
8 | 733493772 | 733493772 | 5000 | 4191.39 | 2016-01-26T12:18:14 | 11.54 | Universe Massage #70014 | US | US | 05 | 01 | personal care | 04/2024 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | PURCHASE | False | 808.61 | True | False
9 | 733493772 | 733493772 | 5000 | 4179.85 | 2016-01-26T12:19:15 | 11.54 | Universe Massage #70014 | US | US | 05 | 01 | personal care | 04/2024 | 2014-08-03 | 2014-08-03 | 492 | 492 | 9184 | REVERSAL | False | 820.15 | True | False

Last rows

index | accountNumber | customerId | creditLimit | availableMoney | transactionDateTime | transactionAmount | merchantName | acqCountry | merchantCountryCode | posEntryMode | posConditionCode | merchantCategoryCode | currentExpDate | accountOpenDate | dateOfLastAddressChange | cardCVV | enteredCVV | cardLast4Digits | transactionType | isFraud | currentBalance | cardPresent | expirationDateKeyInMatch
641904 | 186770399 | 186770399 | 7500 | 3619.56 | 2016-11-04T01:33:34 | 5.37 | Apple iTunes | US | US | 05 | 08 | mobileapps | 01/2030 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 3880.44 | False | False
641905 | 186770399 | 186770399 | 7500 | 3614.19 | 2016-11-07T20:48:59 | 147.97 | Blue Mountain Online Services | US | US | 02 | 01 | online_gifts | 05/2030 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 3885.81 | False | False
641906 | 186770399 | 186770399 | 7500 | 3466.22 | 2016-11-12T11:02:33 | 883.79 | Fresh Online Services | US | US | 09 | 01 | online_gifts | 11/2029 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 4033.78 | False | False
641907 | 186770399 | 186770399 | 7500 | 2582.43 | 2016-11-17T06:45:58 | 16.31 | abc.com | US | US | 09 | 08 | online_subscriptions | 11/2029 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 4917.57 | False | False
641908 | 186770399 | 186770399 | 7500 | 2566.12 | 2016-11-18T19:50:45 | 17.10 | Next Day Online Services | US | US | 05 | 01 | online_gifts | 05/2025 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 4933.88 | False | False
641909 | 186770399 | 186770399 | 7500 | 2574.02 | 2016-12-04T12:29:21 | 5.37 | Apple iTunes | US | US | 05 | 08 | mobileapps | 01/2030 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 4925.98 | False | False
641910 | 186770399 | 186770399 | 7500 | 2568.65 | 2016-12-09T04:20:35 | 223.70 | Blue Mountain eCards | US | US | 09 | 01 | online_gifts | 05/2026 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 4931.35 | False | False
641911 | 186770399 | 186770399 | 7500 | 2344.95 | 2016-12-16T07:58:23 | 138.42 | Fresh Flowers | US | US | 02 | 01 | online_gifts | 10/2019 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 5155.05 | False | False
641912 | 186770399 | 186770399 | 7500 | 2206.53 | 2016-12-19T02:30:35 | 16.31 | abc.com | US | US | 09 | 08 | online_subscriptions | 11/2029 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 5293.47 | False | False
641913 | 186770399 | 186770399 | 7500 | 2190.22 | 2016-12-28T11:14:14 | 32.53 | Next Day Online Services | US | US | 09 | 01 | online_gifts | 08/2025 | 2015-11-04 | 2016-06-03 | 127 | 127 | 5432 | PURCHASE | False | 5309.78 | False | False